ABSTRACT

Community detection in online social networks has been a hot research topic in recent years. Meanwhile, to enjoy more social network services, users nowadays are usually involved in multiple online social networks simultaneously, some of which can share common information and structures. Networks that share some common users are called multiple “partially aligned networks”. In this paper, we aim to detect the communities of multiple partially aligned networks simultaneously, which is formally defined as the “Mutual Clustering” problem. The problem is challenging because it raises two issues: (1) how to preserve the characteristics of each network during mutual community detection, and (2) how to use the information in other aligned networks to refine and disambiguate the community structures of the shared users. To address these two challenges, a novel community detection method, MCD (Mutual Community Detector), is proposed in this paper. MCD detects the social community structures of users in multiple partially aligned networks at the same time, taking into account (1) the characteristics of each network and (2) the information about the shared users across aligned networks. Extensive experiments conducted on two real-world partially aligned heterogeneous social networks demonstrate that MCD can solve the “Mutual Clustering” problem very well.

Categories and Subject Descriptors 

H.2.8 [Database Management]: Database Applications - Data Mining

1. INTRODUCTION

Nowadays, online social networks that provide users with various services have become ubiquitous in our daily life. The services provided by social networks are very diverse, e.g., making new friends online, reading and writing comments on recent news, and recommending products and locations. Real-world social networks that provide these services usually contain heterogeneous information, involving various kinds of information entities (e.g., users, locations, posts) and complex connections (e.g., social links among users, purchase links between users and products). Among the services provided by social networks, community detection techniques play a very important role. For example, organizing online friends into different categories (e.g., “family members”, “celebrities”, and “classmates”) in Facebook and group-level recommendations of products in e-commerce sites are both based on community structures of users detected from the networks.

Meanwhile, as observed in prior work, to enjoy more social services, users nowadays are usually involved in multiple online social networks simultaneously, e.g., Facebook, Twitter and Foursquare. Furthermore, some of these networks can share common information, either because they were established for a common purpose or because they have similar network features. Across these networks, the common users are defined as the anchor users, while the remaining non-shared users are named the non-anchor users. Connections between anchor users’ accounts in different networks are defined as the anchor links. The networks partially aligned by anchor links are called multiple partially aligned networks.

In this paper, we want to detect the communities of each network across multiple partially aligned social networks simultaneously, which is formally defined as the Mutual Clustering problem. The goal is to distill relevant information from another social network to complement the knowledge directly derivable from each network and thereby improve the clustering or community detection, while preserving the distinct characteristics of each individual network. The Mutual Clustering problem is very important for online social networks and can be the prerequisite for many concrete social network applications: (1) network partition: detected communities can usually represent small-sized subgraphs of the network, and (2) comprehensive understanding of user social behaviors: community structures of the shared users in multiple aligned networks can provide a complementary understanding of their social interactions in online social networks.

Besides its importance, the Mutual Clustering problem is a novel problem and different from existing clustering problems, including: (1) consensus clustering, which works with several input clustering results about the same data; (2) multi-view clustering, whose target is to partition objects into clusters based on their different representations, e.g., clustering webpages with text information and hyperlinks; (3) multi-relational clustering [30], which focuses on clustering objects in one relation (called the target relation) using information in multiple inter-linked relations; and (4) co-regularized multi-domain graph clustering, which relaxes the one-to-one constraints on node correspondence relationships between different views in multi-view clustering to “uncertain” mappings. In co-regularized multi-domain graph clustering, prior knowledge about the weights of mappings is required and each view is actually a homogeneous network (more differences are summarized in the related work section). Unlike these existing clustering problems, the Mutual Clustering problem aims at detecting the communities for multiple networks involving both anchor and non-anchor users simultaneously, and each network contains heterogeneous information about users’ social activities. A more detailed comparison of the Mutual Clustering problem with these related problems is available in Table 1.

Despite its importance and novelty, the Mutual Clustering problem is very challenging to solve due to:

• Closeness Measure: Users in heterogeneous social networks can be connected with each other by various direct and indirect connections. A general closeness measure among users that uses such connection information is the prerequisite for addressing the Mutual Clustering problem.

• Network Characteristics: Social networks usually have their own characteristics, which can be reflected in the community structures formed by users. Preservation of each network’s characteristics (i.e., some unique structures in each network’s detected communities) is very important in the Mutual Clustering problem.

• Mutual Community Detection: Information in different networks can provide us with a more comprehensive understanding of the anchor users’ social structures. For anchor users whose community structures are not clear based on information in one network, we can utilize the heterogeneous information in aligned networks to refine and disambiguate their community structures. However, how to achieve such a goal is still an open problem.

• Lack of Metrics: The Mutual Clustering problem is a new problem and few existing metrics can be applied to evaluate the comprehensive performance of Mutual Clustering methods.

To solve all these challenges, a novel cross-network community detection method, MCD (Mutual Community Detector), is proposed in this paper. MCD maps the complex relationships in the social network into a heterogeneous information network and introduces a novel meta-path based closeness measure, HNMP-Sim, to utilize both direct and indirect connections among users in the calculation of closeness scores. With full consideration of the network characteristics, MCD exploits the information in aligned networks to refine and disambiguate the community structures of the multiple networks concurrently. Besides traditional quality and consensus metrics, we define a novel general metric, IQC (Integrated Quality & Consensus), to evaluate the performance of mutual clustering methods.

Figure 1: Heterogeneous online social networks.

This paper is organized as follows: In Section 2, we formulate the problem. Section 3 introduces the Mutual Clustering methods. Section 4 shows the experiment results. In Sections 5 and 6, we give the related works and conclude this paper.


2. PROBLEM FORMULATION 

The networks studied in this paper are Foursquare and Twitter. Users in both Foursquare and Twitter can follow other users and write tips/tweets, which can contain timestamps, text content and location check-ins. As a result, both Foursquare and Twitter can be modeled as heterogeneous information networks $G = (V, E)$, where $V = \mathcal{U} \cup \mathcal{P} \cup \mathcal{L} \cup \mathcal{T} \cup \mathcal{W}$ is the set of different types of nodes in the network and $\mathcal{U}$, $\mathcal{P}$, $\mathcal{L}$, $\mathcal{T}$, $\mathcal{W}$ are the node sets of users, posts, location check-ins, timestamps and words respectively, while $E = E_s \cup E_p \cup E_l \cup E_t \cup E_w$ is the set of directed links in the network and $E_s$, $E_p$, $E_l$, $E_t$ and $E_w$ are the sets of social links among users, links between users and posts, and those between posts and location check-ins, timestamps as well as words respectively. To illustrate the structure of the heterogeneous network studied in this paper, we also give an example in Figure 1. As shown in the figure, users in the network can be extensively connected with each other by different types of links (e.g., social links, co-location check-in connections).

The multiple aligned networks can be modeled as $\mathcal{G} = (G_{set}, A_{set})$, where $G_{set} = \{G^{(1)}, G^{(2)}, \ldots, G^{(n)}\}$, $|G_{set}| = n$, is the set of $n$ heterogeneous information networks and $A_{set}$ is the set of undirected anchor links between different heterogeneous networks in $G_{set}$. In this paper, we follow the definitions of “anchor user”, “non-anchor user”, “anchor link”, etc. proposed in [32, 33], and the constraint on anchor links is “one-to-one”, i.e., each user can have one account in one network. The case in which users have multiple accounts in online social networks can be resolved with a method introduced in prior work, where these duplicated accounts are aggregated in advance to form one unique virtual account, so that the anchor links connecting these virtual accounts are still “one-to-one”. Unlike the setting in prior work, the networks studied in this paper are all partially aligned.

Table 1: Summary of related problems.

Mutual Clustering Problem: For the given multiple aligned heterogeneous networks $\mathcal{G}$, the Mutual Clustering problem aims to obtain the optimal communities $\{C^{(1)}, C^{(2)}, \ldots, C^{(n)}\}$ for $\{G^{(1)}, G^{(2)}, \ldots, G^{(n)}\}$ simultaneously, where $C^{(i)} = \{U_1^{(i)}, U_2^{(i)}, \ldots, U_{k^{(i)}}^{(i)}\}$ is a partition of the user set $\mathcal{U}^{(i)}$ of $G^{(i)}$. Users in each detected cluster are more densely connected with each other than with users in other clusters. In this paper, we focus on studying the hard (i.e., non-overlapping) clustering of users in online social networks.



3. PROPOSED METHODS 

A co-regularization based multi-view clustering model was proposed in prior work, which obtains the clustering results of nodes across multiple views by minimizing the absolute clustering disagreement of all nodes (both shared and non-shared nodes). It cannot be applied to the Mutual Clustering problem because, in mutual clustering, we exploit information across networks to refine the social community structures of anchor users only, while the social community structures of non-anchor users are not affected and can preserve their characteristics. To solve the Mutual Clustering problem, a novel community detection method, MCD, is proposed in this section. By mapping the social network relations into a heterogeneous information network, we use the concept of social meta path to define a closeness measure among users in Section 3.1. Based on this similarity measure, we introduce the independent clustering method that preserves network characteristics in Section 3.2 and the normalized discrepancy based co-clustering method in Section 3.3. To preserve network characteristics and use information in other networks to refine community structures mutually at the same time, we study the mutual clustering problem in Section 3.4.

3.1 HNMP-Sim 

Many existing similarity measures defined for homogeneous networks, e.g., “Common Neighbor” and “Jaccard’s Coefficient”, cannot capture all the connections among users in heterogeneous networks. To use both direct and indirect connections among users when calculating the similarity score among users in the heterogeneous information network, we introduce the meta path based similarity measure HNMP-Sim in this section.


3.1.1 Meta Paths in Heterogeneous Networks 
In heterogeneous networks, pairs of nodes can be connected by different paths, which are sequences of links in the network. Meta paths [25] in heterogeneous networks, i.e., heterogeneous network meta paths (HNMPs), can capture both direct and indirect connections among nodes in a network. The length of a meta path is defined as the number of links that constitute it. Meta paths in networks can start and end with various node types. However, in this paper, we are mainly concerned with those starting and ending with users, which are formally defined as the social HNMPs. The notation, definition and semantics of the 7 different social HNMPs used in this paper are listed in Table 2. To extract the social meta paths, prior domain knowledge about the network structure is required.


3.1.2 HNMP-based Similarity 
These 7 different social HNMPs in Table 2 can cover many of the connections among users in networks. Some meta path based similarity measures have been proposed so far, e.g., PathSim, which is defined for undirected networks and considers different meta paths to be of the same importance. To measure the social closeness among users in directed heterogeneous information networks, we extend PathSim to propose a new closeness measure as follows.
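Definition 1 (HNMP-Sim): Let $\mathcal{P}_i(x \rightsquigarrow y)$ and $\mathcal{P}_i(x \rightsquigarrow \cdot)$ be the sets of path instances of HNMP #i going from $x$ to $y$ and those going from $x$ to other nodes in the network. The HNMP-Sim (HNMP based Similarity) of node pair $(x, y)$ is defined as

$$\text{HNMP-Sim}(x, y) = \sum_i \omega_i \left( \frac{|\mathcal{P}_i(x \rightsquigarrow y)| + |\mathcal{P}_i(y \rightsquigarrow x)|}{|\mathcal{P}_i(x \rightsquigarrow \cdot)| + |\mathcal{P}_i(y \rightsquigarrow \cdot)|} \right),$$

where $\omega_i$ is the weight of HNMP #i and $\sum_i \omega_i = 1$. In this paper, the weights of different HNMPs can be automatically adjusted by applying the technique proposed in [34].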

Let $A_i$ be the adjacency matrix corresponding to HNMP #i among users in the network, where $A_i(m, n) = k$ iff there exist $k$ different path instances of HNMP #i from user $m$ to $n$ in the network. Furthermore, the similarity score matrix among users of HNMP #i can be represented as $S_i = (D_i + \bar{D}_i)^{-1} (A_i + A_i^T)$, where $A_i^T$ denotes the transpose of $A_i$, and diagonal matrices $D_i$ and $\bar{D}_i$ have values $D_i(l, l) = \sum_m A_i(l, m)$ and $\bar{D}_i(l, l) = \sum_m A_i^T(l, m)$ on their diagonals respectively. The HNMP-Sim matrix of the network, which can capture all possible connections among users, is represented as follows:

$$S = \sum_i \omega_i S_i = \sum_i \omega_i \left( (D_i + \bar{D}_i)^{-1} (A_i + A_i^T) \right).$$
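The matrix form of HNMP-Sim can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `hnmp_sim` is hypothetical, each $A_i$ is taken to be a dense matrix of path-instance counts, $\bar{D}_i$ is assumed to hold the row sums of $A_i^T$, and users with no path instances are given an all-zero row rather than causing a division by zero.

```python
import numpy as np

def hnmp_sim(adjs, weights):
    """Weighted sum over meta paths of (D_i + Dbar_i)^-1 (A_i + A_i^T)."""
    n = adjs[0].shape[0]
    S = np.zeros((n, n))
    for A, w in zip(adjs, weights):
        A = np.asarray(A, dtype=float)
        out_cnt = A.sum(axis=1)  # D_i(l, l): path instances starting at user l
        in_cnt = A.sum(axis=0)   # assumed Dbar_i(l, l): row sums of A_i^T
        denom = out_cnt + in_cnt
        # Users without any path instance keep a zero row (assumption).
        inv = np.divide(1.0, denom, out=np.zeros_like(denom), where=denom > 0)
        S += w * (inv[:, None] * (A + A.T))
    return S

# Toy input: one meta path, two instances from user 0 to 1, one from 1 to 0.
A1 = np.array([[0, 2, 0], [1, 0, 0], [0, 0, 0]])
S = hnmp_sim([A1], [1.0])
```

In the toy input, user 2 takes part in no path instance, so its row and column of `S` stay at zero.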












Table 2: Summary of HNMPs. 

3.2 Network Characteristic Preservation Clustering

Clustering each network independently can preserve each network’s characteristics effectively, as no information from external networks will interfere with the clustering results. Partitioning users of a certain network into several clusters will cut connections in the network and inevitably lead to some costs. Optimal clustering results can be achieved by minimizing the clustering costs.

For a given network $G$, let $C = \{U_1, U_2, \ldots, U_k\}$ be the community structures detected from $G$. Term $\bar{U}_i = \mathcal{U} - U_i$ is defined to be the complement of set $U_i$ in $G$. Various cost measures of partition $C$ can be used, e.g., cut and normalized cut [23]:

$$cut(C) = \sum_{i=1}^{k} S(U_i, \bar{U}_i) = \sum_{i=1}^{k} \sum_{u \in U_i, v \in \bar{U}_i} S(u, v),$$

$$Ncut(C) = \sum_{i=1}^{k} \frac{S(U_i, \bar{U}_i)}{S(U_i, \cdot)} = \sum_{i=1}^{k} \frac{cut(U_i, \bar{U}_i)}{S(U_i, \cdot)},$$

where $S(u, v)$ denotes the HNMP-Sim between $u$ and $v$, and $S(U_i, \cdot) = S(U_i, \mathcal{U}) = S(U_i, U_i) + S(U_i, \bar{U}_i)$.

For all users in $\mathcal{U}$, their clustering result can be represented in the result confidence matrix $H$, where $H = [h_1, h_2, \ldots, h_n]^T$, $n = |\mathcal{U}|$, $h_i = (h_{i,1}, h_{i,2}, \ldots, h_{i,k})$ and $h_{i,j}$ denotes the confidence that $u_i \in \mathcal{U}$ is in cluster $U_j \in C$. The optimal $H$ that minimizes the normalized-cut cost can be obtained by solving the following objective function:

$$\min_{H} \; \text{Tr}(H^T L H), \quad \text{s.t.} \; H^T D H = I,$$

where $L = D - S$, diagonal matrix $D$ has $D(i, i) = \sum_j S(i, j)$ on its diagonal, and $I$ is an identity matrix.
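The relaxed normalized-cut problem above has a standard spectral solution, which can be sketched as follows. The sketch makes several assumptions that are not stated in the text: the similarity matrix is symmetrized first, every user has a positive row sum, and the function name `ncut_relaxation` is hypothetical. It substitutes $X = D^{1/2} H$, takes the eigenvectors of $D^{-1/2} L D^{-1/2}$ with the $k$ smallest eigenvalues, and maps them back to $H$.

```python
import numpy as np

def ncut_relaxation(S, k):
    """Return H minimizing Tr(H^T L H) subject to H^T D H = I (relaxed Ncut)."""
    S = 0.5 * (S + S.T)                # assumption: work with a symmetric S
    d = S.sum(axis=1)                  # diagonal of D, assumed strictly positive
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = np.diag(d) - S
    L_tilde = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_tilde)  # eigenvalues in ascending order
    X = vecs[:, :k]                    # orthonormal columns, X^T X = I
    return d_inv_sqrt[:, None] * X     # H = D^{-1/2} X

# Toy similarity matrix: two disconnected pairs of users.
S_toy = np.array([[0.0, 1.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0],
                  [0.0, 0.0, 1.0, 0.0]])
H = ncut_relaxation(S_toy, k=2)
D_toy = np.diag(S_toy.sum(axis=1))
```

A hard partition would still have to be read off from the rows of `H`, for example with k-means; that step is left out here.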

3.3 Discrepancy based Clustering of Multiple 
Networks 

Besides the shared information due to common network construction purposes and similar network features, anchor users can also have unique information (e.g., social structures) across aligned networks, which can provide us with a more comprehensive knowledge about the community structures formed by these users. Meanwhile, by maximizing the consensus (i.e., minimizing the discrepancy) of the clustering results about the anchor users in multiple partially aligned networks, we refine the clustering results of the anchor users with information in other aligned networks mutually. We can represent the clustering results achieved in $G^{(1)}$ and $G^{(2)}$ as $C^{(1)} = \{U_1^{(1)}, U_2^{(1)}, \ldots, U_{k^{(1)}}^{(1)}\}$ and $C^{(2)} = \{U_1^{(2)}, U_2^{(2)}, \ldots, U_{k^{(2)}}^{(2)}\}$ respectively.

Let $u_i$ and $u_j$ be two anchor users in the network, whose accounts in $G^{(1)}$ and $G^{(2)}$ are $u_i^{(1)}$, $u_i^{(2)}$, $u_j^{(1)}$ and $u_j^{(2)}$ respectively. If users $u_i^{(1)}$ and $u_j^{(1)}$ are partitioned into the same cluster in $G^{(1)}$ but their corresponding accounts $u_i^{(2)}$ and $u_j^{(2)}$ are partitioned into different clusters in $G^{(2)}$, then it will lead to a discrepancy between the clustering results of $u_i^{(1)}$, $u_j^{(1)}$, $u_i^{(2)}$ and $u_j^{(2)}$ in aligned networks $G^{(1)}$ and $G^{(2)}$.

Definition 2 (Discrepancy): The discrepancy between the clustering results of $u_i$ and $u_j$ across aligned networks $G^{(1)}$ and $G^{(2)}$ is defined as the difference of the confidence scores of $u_i$ and $u_j$ being partitioned into the same cluster across aligned networks. In the clustering results, the confidence scores of $u_i^{(1)}$ and $u_j^{(1)}$ ($u_i^{(2)}$ and $u_j^{(2)}$) being partitioned into the $k^{(1)}$ ($k^{(2)}$) clusters can be represented as vectors $h_i^{(1)}$ and $h_j^{(1)}$ ($h_i^{(2)}$ and $h_j^{(2)}$) respectively, and the confidences that $u_i$ and $u_j$ are in the same cluster in $G^{(1)}$ and $G^{(2)}$ can be denoted as $h_i^{(1)} (h_j^{(1)})^T$ and $h_i^{(2)} (h_j^{(2)})^T$. Formally, the discrepancy of the clustering results about $u_i$ and $u_j$ is defined to be

$$d_{ij}(C^{(1)}, C^{(2)}) = \left( h_i^{(1)} (h_j^{(1)})^T - h_i^{(2)} (h_j^{(2)})^T \right)^2$$

if $u_i$, $u_j$ are both anchor users, and $d_{ij}(C^{(1)}, C^{(2)}) = 0$ otherwise. Furthermore, the discrepancy of $C^{(1)}$ and $C^{(2)}$ will be:

$$d(C^{(1)}, C^{(2)}) = \sum_{i}^{n^{(1)}} \sum_{j}^{n^{(2)}} d_{ij}(C^{(1)}, C^{(2)}),$$

where $n^{(1)} = |\mathcal{U}^{(1)}|$ and $n^{(2)} = |\mathcal{U}^{(2)}|$. In this definition, non-anchor users are not involved in the discrepancy calculation, which is totally different from the clustering disagreement function of the co-regularization based multi-view clustering model, in which all the nodes are included.

However, $d(C^{(1)}, C^{(2)})$ is highly dependent on the number of anchor users and anchor links between $G^{(1)}$ and $G^{(2)}$: minimizing $d(C^{(1)}, C^{(2)})$ can favor high-consensus clustering results when the anchor users are abundant but has no significant effect when the anchor users are very rare. To solve this problem, we propose to minimize the normalized discrepancy instead, which significantly differs from the absolute clustering disagreement cost used in the co-regularization based model.

Figure 2: An example to illustrate the clustering discrepancy.


Definition 3 (Normalized Discrepancy): The normalized discrepancy measure computes the differences of clustering results in two aligned networks as a fraction of the discrepancy with regard to the number of anchor users across partially aligned networks:

$$Nd(C^{(1)}, C^{(2)}) = \frac{d(C^{(1)}, C^{(2)})}{|A^{(1,2)}| \, (|A^{(1,2)}| - 1)}.$$

Optimal consensus clustering results of $G^{(1)}$ and $G^{(2)}$ will be $\hat{C}^{(1)}$, $\hat{C}^{(2)}$:

$$\hat{C}^{(1)}, \hat{C}^{(2)} = \arg\min_{C^{(1)}, C^{(2)}} Nd(C^{(1)}, C^{(2)}).$$

Similarly, the normalized-discrepancy objective function can also be represented with the clustering result confidence matrices $H^{(1)}$ and $H^{(2)}$. Meanwhile, because the networks studied in this paper are partially aligned, matrices $H^{(1)}$ and $H^{(2)}$ contain the results of both anchor users and non-anchor users, while non-anchor users should not be involved in the discrepancy calculation according to the definition of discrepancy. We propose to first prune the results of the non-anchor users with the following anchor transition matrix.

Definition 4 (Anchor Transition Matrix): Binary matrix $T^{(1,2)}$ (or $T^{(2,1)}$) is defined as the anchor transition matrix from network $G^{(1)}$ to $G^{(2)}$ (or from $G^{(2)}$ to $G^{(1)}$), where $T^{(1,2)} = (T^{(2,1)})^T$, $T^{(1,2)}(i, j) = 1$ if $(u_i^{(1)}, u_j^{(2)}) \in A^{(1,2)}$ and 0 otherwise. The row indexes of $T^{(1,2)}$ (or $T^{(2,1)}$) are of the same order as those of $H^{(1)}$ (or $H^{(2)}$). Because the constraint on anchor links is “one-to-one” in this paper, each row/column of $T^{(1,2)}$ and $T^{(2,1)}$ contains at most one entry filled with 1.

In Figure 2, we show an example of the clustering discrepancy of two partially aligned networks $G^{(1)}$ and $G^{(2)}$, users in which are grouped into two clusters $\{\{u_1, u_3\}, \{u_2\}\}$ and $\{\{u_A, u_C\}, \{u_B, u_D\}\}$ respectively. Users $u_1$, $u_A$ and $u_3$, $u_C$ are identified to be anchor users, based on which we can construct the “anchor transition matrices” $T^{(1,2)}$ and $T^{(2,1)}$ as shown in the upper right plot. Furthermore, based on the community structure, we can construct the clustering confidence matrices as shown in the lower left plot. To obtain the clustering results of anchor users only, the anchor transition matrix can be applied to prune the clustering results of non-anchor users from the clustering confidence matrices. By multiplying the anchor transition matrices $(T^{(1,2)})^T$ and $(T^{(2,1)})^T$ with clustering confidence matrices $H^{(1)}$ and $H^{(2)}$ respectively, we can obtain the “pruned confidence matrices” as shown in the lower right plot of Figure 2. Entries corresponding to anchor users $u_1$, $u_3$, $u_A$ and $u_C$ are preserved but those corresponding to non-anchor users are all pruned.

Algorithm 1 Curvilinear Search Method (CSM)
Input: $X_k$, $C_k$, $Q_k$ and function $\mathcal{F}$
       parameters $\epsilon = \{\rho, \eta, \delta, \tau, \tau_m, \tau_M\}$
Output: $X_{k+1}$, $C_{k+1}$, $Q_{k+1}$
$Y(\tau) = (I + \frac{\tau}{2} A)^{-1} (I - \frac{\tau}{2} A) X_k$
while $\mathcal{F}(Y(\tau)) > C_k + \rho \tau \mathcal{F}'(Y(0))$ do
    $\tau = \delta \tau$
    $Y(\tau) = (I + \frac{\tau}{2} A)^{-1} (I - \frac{\tau}{2} A) X_k$
end while
$X_{k+1} = Y(\tau)$
$Q_{k+1} = \eta Q_k + 1$
$C_{k+1} = (\eta Q_k C_k + \mathcal{F}(X_{k+1})) / Q_{k+1}$
$\tau = \max(\min(\tau, \tau_M), \tau_m)$

In this example, the clustering discrepancy of the partially aligned networks should be 0 according to the above discrepancy definition. Meanwhile, networks $G^{(1)}$ and $G^{(2)}$ are of different sizes and the pruned confidence matrices are of different dimensions, e.g., $(T^{(1,2)})^T H^{(1)} \in \mathbb{R}^{4 \times 2}$ and $(T^{(2,1)})^T H^{(2)} \in \mathbb{R}^{3 \times 2}$. To represent the discrepancy with the clustering confidence matrices, we need to further accommodate the dimensions of the different pruned clustering confidence matrices. This can be achieved by multiplying one of the pruned clustering confidence matrices with the corresponding anchor transition matrix again, which will not prune entries but only adjust the matrix dimensions. Let $\bar{H}^{(1)} = (T^{(1,2)})^T H^{(1)}$ and $\bar{H}^{(2)} = (T^{(1,2)})^T (T^{(2,1)})^T H^{(2)}$. In the example, we can represent the clustering discrepancy to be

$$\left\| \bar{H}^{(1)} (\bar{H}^{(1)})^T - \bar{H}^{(2)} (\bar{H}^{(2)})^T \right\|_F^2 = 0,$$

where matrix $\bar{H} \bar{H}^T$ indicates whether pairs of anchor users are in the same cluster or not.

Furthermore, the objective function of inferring the clustering confidence matrices that minimize the normalized discrepancy can be represented as follows:

$$\min_{H^{(1)}, H^{(2)}} \frac{\left\| \bar{H}^{(1)} (\bar{H}^{(1)})^T - \bar{H}^{(2)} (\bar{H}^{(2)})^T \right\|_F^2}{\|T^{(1,2)}\|_F^2 \left( \|T^{(1,2)}\|_F^2 - 1 \right)},$$

$$\text{s.t.} \; (H^{(1)})^T D^{(1)} H^{(1)} = I, \; (H^{(2)})^T D^{(2)} H^{(2)} = I,$$

where $D^{(1)}$, $D^{(2)}$ are the corresponding diagonal matrices of the HNMP-Sim matrices of networks $G^{(1)}$ and $G^{(2)}$ respectively.
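The Figure 2 example can be checked numerically with a short sketch. The user ordering, the hard 0/1 confidence matrices and the function name `normalized_discrepancy` are assumptions made for illustration; the computation prunes non-anchor users with the anchor transition matrix $T^{(1,2)}$, aligns the dimensions, and divides the squared Frobenius norm by $\|T^{(1,2)}\|_F^2 (\|T^{(1,2)}\|_F^2 - 1)$, which requires at least two anchor links.

```python
import numpy as np

def normalized_discrepancy(H1, H2, T12):
    """Normalized discrepancy of two confidence matrices over anchor users."""
    T21 = T12.T
    H1_bar = T12.T @ H1            # pruned H1, indexed by users of network 2
    H2_bar = T12.T @ T21.T @ H2    # pruned H2, mapped back to the same index
    diff = H1_bar @ H1_bar.T - H2_bar @ H2_bar.T
    n_anchor = np.sum(T12 ** 2)    # squared Frobenius norm = number of anchor links
    return np.sum(diff ** 2) / (n_anchor * (n_anchor - 1))

# Network 1: users u1, u2, u3 with clusters {u1, u3}, {u2}.
H1 = np.array([[1, 0], [0, 1], [1, 0]], dtype=float)
# Network 2: users uA, uB, uC, uD with clusters {uA, uC}, {uB, uD}.
H2 = np.array([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)
# Anchor links (u1, uA) and (u3, uC).
T12 = np.zeros((3, 4))
T12[0, 0] = 1.0
T12[2, 2] = 1.0

nd_consistent = normalized_discrepancy(H1, H2, T12)

# Variant: uC is moved to the cluster of uB and uD, so the anchor pair disagrees.
H2_moved = np.array([[1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
nd_conflict = normalized_discrepancy(H1, H2_moved, T12)
```

With the clusters of Figure 2 the value is zero, and moving $u_C$ to the other cluster makes it positive, which is the behavior the definition is meant to capture.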


3.4 Joint Mutual Clustering of Multiple Networks

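The normalized-cut objective function favors clustering results that preserve the characteristics of each network, whereas the normalized-discrepancy objective function favors consensus results that are mutually refined with information from other aligned networks. Taking both of these issues into consideration, the optimal mutual clustering results $\hat{C}^{(1)}$ and $\hat{C}^{(2)}$ of aligned networks $G^{(1)}$ and $G^{(2)}$ can be achieved as follows:

$$\arg\min_{C^{(1)}, C^{(2)}} \; \alpha \cdot Ncut(C^{(1)}) + \beta \cdot Ncut(C^{(2)}) + \theta \cdot Nd(C^{(1)}, C^{(2)}),$$

where $\alpha$, $\beta$ and $\theta$ represent the weights of these terms and, for simplicity, $\alpha$ and $\beta$ are both set as 1 in this paper.

By replacing $Ncut(C^{(1)})$, $Ncut(C^{(2)})$ and $Nd(C^{(1)}, C^{(2)})$ with the objective equations derived above, we can rewrite the joint objective function as follows:

$$\min_{H^{(1)}, H^{(2)}} \; \alpha \cdot \text{Tr}((H^{(1)})^T L^{(1)} H^{(1)}) + \beta \cdot \text{Tr}((H^{(2)})^T L^{(2)} H^{(2)}) + \theta \cdot \frac{\left\| \bar{H}^{(1)} (\bar{H}^{(1)})^T - \bar{H}^{(2)} (\bar{H}^{(2)})^T \right\|_F^2}{\|T^{(1,2)}\|_F^2 \left( \|T^{(1,2)}\|_F^2 - 1 \right)},$$

$$\text{s.t.} \; (H^{(1)})^T D^{(1)} H^{(1)} = I, \; (H^{(2)})^T D^{(2)} H^{(2)} = I,$$

where $L^{(1)} = D^{(1)} - S^{(1)}$, $L^{(2)} = D^{(2)} - S^{(2)}$, and matrices $S^{(1)}$, $S^{(2)}$ and $D^{(1)}$, $D^{(2)}$ are the HNMP-Sim matrices and their corresponding diagonal matrices defined before.

The objective function is a complex optimization problem with orthogonality constraints, which can be very difficult to solve because the constraints are not only non-convex but also numerically expensive to preserve during iterations. Meanwhile, by substituting $(D^{(1)})^{\frac{1}{2}} H^{(1)}$ and $(D^{(2)})^{\frac{1}{2}} H^{(2)}$ with $X^{(1)}$ and $X^{(2)}$, we can transform the objective function into a standard form of problems solvable with the method proposed by Wen et al.:

$$\min_{X^{(1)}, X^{(2)}} \; \alpha \cdot \text{Tr}((X^{(1)})^T \tilde{L}^{(1)} X^{(1)}) + \beta \cdot \text{Tr}((X^{(2)})^T \tilde{L}^{(2)} X^{(2)}) + \theta \cdot \frac{\left\| \tilde{T}^{(1)} X^{(1)} (\tilde{T}^{(1)} X^{(1)})^T - \tilde{T}^{(2)} X^{(2)} (\tilde{T}^{(2)} X^{(2)})^T \right\|_F^2}{\|T^{(1,2)}\|_F^2 \left( \|T^{(1,2)}\|_F^2 - 1 \right)},$$

$$\text{s.t.} \; (X^{(1)})^T X^{(1)} = I, \; (X^{(2)})^T X^{(2)} = I,$$

where $\tilde{L}^{(1)} = ((D^{(1)})^{-\frac{1}{2}})^T L^{(1)} ((D^{(1)})^{-\frac{1}{2}})$, $\tilde{L}^{(2)} = ((D^{(2)})^{-\frac{1}{2}})^T L^{(2)} ((D^{(2)})^{-\frac{1}{2}})$, $\tilde{T}^{(1)} = (T^{(1,2)})^T (D^{(1)})^{-\frac{1}{2}}$ and $\tilde{T}^{(2)} = (T^{(1,2)})^T (T^{(2,1)})^T (D^{(2)})^{-\frac{1}{2}}$.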

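Wen et al. propose a feasible method to solve the above optimization problems with a constraint-preserving update scheme. They propose to update one variable, e.g., $X^{(1)}$, while fixing the other variable, e.g., $X^{(2)}$, alternately with the curvilinear search with Barzilai-Borwein step method until convergence. For example, when $X^{(2)}$ is fixed, we can simplify the objective function into

$$\min_{X} \mathcal{F}(X), \quad \text{s.t.} \; X^T X = I,$$

where $X = X^{(1)}$ and $\mathcal{F}(X)$ is the objective function, which can be solved with the curvilinear search with Barzilai-Borwein step method of Wen et al. to update $X$ until convergence. The variable $X$ after the $(k+1)$th iteration will be

$$X_{k+1} = Y(\tau_k), \quad Y(\tau_k) = \left( I + \frac{\tau_k}{2} A \right)^{-1} \left( I - \frac{\tau_k}{2} A \right) X_k,$$

$$A = \frac{\partial \mathcal{F}(X_k)}{\partial X} X_k^T - X_k \left( \frac{\partial \mathcal{F}(X_k)}{\partial X} \right)^T,$$

where

$$\hat{\tau} = \frac{\text{Tr}\left( (X_k - X_{k-1})^T (X_k - X_{k-1}) \right)}{\left| \text{Tr}\left( (X_k - X_{k-1})^T (\nabla \mathcal{F}(X_k) - \nabla \mathcal{F}(X_{k-1})) \right) \right|}, \quad \tau_k = \hat{\tau} \delta^h,$$

$\hat{\tau}$ is the Barzilai-Borwein step size and $h$ is the smallest integer to make $\tau_k$ satisfy

$$\mathcal{F}(Y(\tau_k)) \le C_k + \rho \tau_k \mathcal{F}'_{\tau}(Y(0)).$$

Terms $C$, $Q$ are defined as $C_{k+1} = (\eta Q_k C_k + \mathcal{F}(X_{k+1})) / Q_{k+1}$ and $Q_{k+1} = \eta Q_k + 1$, $Q_0 = 1$. More detailed derivations of the curvilinear search method (i.e., Algorithm 1) with Barzilai-Borwein step are available in the work of Wen et al. Meanwhile, the pseudo-code of method MCD is available in Algorithm 2. Based on the achieved solutions $X^{(1)}$ and $X^{(2)}$, we can get $H^{(1)} = (D^{(1)})^{-\frac{1}{2}} X^{(1)}$ and $H^{(2)} = (D^{(2)})^{-\frac{1}{2}} X^{(2)}$.

Algorithm 2 Mutual Community Detector (MCD)
Input: aligned network: $\mathcal{G} = \{\{G^{(1)}, G^{(2)}\}, \{A^{(1,2)}, A^{(2,1)}\}\}$
       number of clusters in $G^{(1)}$ and $G^{(2)}$: $k^{(1)}$ and $k^{(2)}$
       HNMP-Sim matrices weight: $\omega$
       parameters: $\epsilon = \{\rho, \eta, \delta, \tau, \tau_m, \tau_M\}$
       function $\mathcal{F}$ and consensus term weight $\theta$
Output: $H^{(1)}$, $H^{(2)}$
Calculate HNMP-Sim matrices, $S_i^{(1)}$ and $S_i^{(2)}$
$S^{(1)} = \sum_i \omega_i S_i^{(1)}$, $S^{(2)} = \sum_i \omega_i S_i^{(2)}$
Initialize $X^{(1)}$ and $X^{(2)}$ with Kmeans clustering results on $S^{(1)}$ and $S^{(2)}$
Initialize $C_0^{(1)} = 0$, $Q_0^{(1)} = 1$ and $C_0^{(2)} = 0$, $Q_0^{(2)} = 1$
converge = False
while converge = False do
    /* update $X^{(1)}$ and $X^{(2)}$ with CSM */
    $X_{k+1}^{(1)}, C_{k+1}^{(1)}, Q_{k+1}^{(1)} = CSM(X_k^{(1)}, C_k^{(1)}, Q_k^{(1)}, \mathcal{F}, \epsilon)$
    $X_{k+1}^{(2)}, C_{k+1}^{(2)}, Q_{k+1}^{(2)} = CSM(X_k^{(2)}, C_k^{(2)}, Q_k^{(2)}, \mathcal{F}, \epsilon)$
    if $X_{k+1}^{(1)}$ and $X_{k+1}^{(2)}$ both converge then
        converge = True
    end if
end while
$H^{(1)} = (D^{(1)})^{-\frac{1}{2}} X^{(1)}$, $H^{(2)} = (D^{(2)})^{-\frac{1}{2}} X^{(2)}$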
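The constraint-preserving update at the core of the curvilinear search can be illustrated with a single step. The sketch below is not the full method: it uses a fixed small step size instead of the Barzilai-Borwein step and the backtracking test, and it swaps in a toy objective $\mathcal{F}(X) = \text{Tr}(X^T L X)$ on a path graph. The names `cayley_step`, `objective` and `gradient` are hypothetical.

```python
import numpy as np

def cayley_step(X, grad, tau):
    """One update Y(tau) = (I + tau/2 A)^-1 (I - tau/2 A) X with A = G X^T - X G^T."""
    n = X.shape[0]
    A = grad @ X.T - X @ grad.T    # skew-symmetric by construction
    I = np.eye(n)
    return np.linalg.solve(I + 0.5 * tau * A, (I - 0.5 * tau * A) @ X)

# Toy objective on a 6-node path graph (assumption, not the joint MCD objective).
adj = np.diag(np.ones(5), 1) + np.diag(np.ones(5), -1)
L = np.diag(adj.sum(axis=1)) - adj

def objective(X):
    return np.trace(X.T @ L @ X)

def gradient(X):
    return 2.0 * L @ X             # valid because L is symmetric

rng = np.random.default_rng(0)
X0, _ = np.linalg.qr(rng.standard_normal((6, 2)))  # feasible start: X0^T X0 = I
X1 = cayley_step(X0, gradient(X0), tau=1e-3)
```

Because $A$ is skew-symmetric, the update is a Cayley transform of $X_k$, so `X1` keeps orthonormal columns without any explicit re-orthogonalization; this is the property that makes the scheme attractive for orthogonality constraints.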


4. EXPERIMENTS 

To demonstrate the effectiveness of MCD, we conduct extensive experiments on two real-world partially aligned heterogeneous networks, Foursquare and Twitter, in this section.

4.1 Dataset Description

As mentioned in Section 2, both Foursquare and Twitter used in this paper are heterogeneous social networks, whose statistical information is given in Table 3. These two networks were crawled during November 2012 with methods proposed in prior work. The number of anchor links obtained is 3,388. Some basic descriptions of the datasets are as follows:


Table 3: Properties of the Heterogeneous Social Networks

          property          Twitter      Foursquare
# node    user              5,223        5,392
          tweet / tip       9,490,707    48,756
          location          297,182      38,921
# link    friend / follow   164,920      76,972
          write             9,490,707    48,756
          locate            615,515      48,756

• Foursquare: Users together with their posts are crawled from Foursquare, whose numbers are 5,392 and 48,756 respectively. The number of social links among users is 76,972. All these posts are written by these users and can attach location check-ins; as a result, the numbers of write links and locate links are both 48,756. 38,921 different locations are crawled from Foursquare.

• Twitter: 5,223 users and all their tweets, whose number is 9,490,707, are crawled from Twitter and, on average, each user has about 1,817 tweets. Among these tweets, about 615,515 have location check-ins, which accounts for about 6.48% of all tweets. The number of locations crawled from Twitter is 297,182 and the number of social links among users is 164,920.
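The derived figures quoted for the two datasets can be re-computed directly from the raw counts. The short check below uses only numbers stated above; the variable names are illustrative.

```python
# Raw counts as reported for the crawled datasets.
twitter_users = 5_223
twitter_tweets = 9_490_707
twitter_geo_tweets = 615_515
foursquare_users = 5_392
foursquare_tips = 48_756

# Average number of tweets per Twitter user.
tweets_per_user = twitter_tweets / twitter_users

# Share of tweets that carry a location check-in, in percent.
geo_share = 100.0 * twitter_geo_tweets / twitter_tweets

# Average number of tips per Foursquare user, for comparison.
tips_per_user = foursquare_tips / foursquare_users
```

The average comes out at roughly 1,817 tweets per user and the check-in share at roughly 6.5%, in line with the reported values, while Foursquare users wrote only about 9 tips each on average.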

More information about the datasets and the crawling methods is available in the works that introduced them.

4.2 Experiment Settings 

4.2.1 Comparison Methods 

The comparison methods used in the experiments can be divided into three categories.

Mutual Clustering Methods

• MCD: MCD is the mutual community detection method proposed in this paper, which can detect the communities of multiple aligned networks with consideration of the connections and characteristics of different networks. Heterogeneous information in multiple aligned networks is applied in building MCD.

Multi-Network Clustering Methods

• SICLUS: An existing clustering method calculates the similarity scores among users by propagating heterogeneous information across views/networks. In this paper, we extend that method and propose SICLUS to calculate the intimacy scores among users in multiple networks simultaneously, based on which users can be grouped into different clusters with clustering models based on intimacy matrix factorization. Heterogeneous information across networks is used to build SICLUS.

Isolated Clustering Methods, which can detect communities in each isolated network:

• Ncut: Ncut is the clustering method based on normalized cut. In the experiments, Ncut detects the communities in each social network merely based on the social connections in that network.

• Kmeans: Kmeans is a traditional clustering method, which in the experiments is used to detect communities in social networks based on the social connections only.

4.2.2 Evaluation Methods 

The evaluation metrics applied in this paper can be divided into two categories: Quality Metrics and Consensus Metrics.

Quality Metrics: 4 widely used quality metrics are applied to measure the clustering result, e.g., $C = \{U_i\}_{i=1}^{K}$, of each network.

• normalized-dbi [38]:

$$ndbi(C) = \frac{1}{K} \sum_{i} \min_{j \neq i} \frac{d(c_i, c_j) + d(c_j, c_i)}{\sigma_i + \sigma_j + d(c_i, c_j) + d(c_j, c_i)},$$

where $c_i$ is the centroid of community $U_i \in C$, $d(c_i, c_j)$ denotes the distance between centroids $c_i$ and $c_j$, and $\sigma_i$ represents the average distance between elements in $U_i$ and centroid $c_i$. (Higher ndbi corresponds to better performance).

• entropy:

$$H(C) = -\sum_{i=1}^{K} P(i) \log P(i),$$

where $P(i) = \frac{|U_i|}{\sum_{i=1}^{K} |U_i|}$. (Lower entropy corresponds to better performance).

• density [38]:

$$dens(C) = \sum_{i=1}^{K} \frac{|E_i|}{|E|},$$

where $E$ and $E_i$ are the edge sets in the network and $U_i$. (Higher density corresponds to better performance).

• silhouette [15]:

$$sil(C) = \frac{1}{K} \sum_{i=1}^{K} \left( \frac{1}{|U_i|} \sum_{u \in U_i} \frac{b(u) - a(u)}{\max\{a(u), b(u)\}} \right),$$

where $a(u) = \frac{1}{|U_i| - 1} \sum_{v \in U_i, u \neq v} d(u, v)$ and $b(u) = \min_{j, j \neq i} \left( \frac{1}{|U_j|} \sum_{v \in U_j} d(u, v) \right)$. (Higher silhouette corresponds to better performance).
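Two of the quality metrics above, entropy and density, depend only on cluster sizes and edge membership and can be sketched without any distance function. The code below is illustrative: the function names are hypothetical, clusters are given as lists of sets of user ids, edges as pairs of user ids, and the natural logarithm is assumed because the text does not state the base.

```python
import math

def entropy(clusters):
    """H(C) = -sum_i P(i) log P(i), with P(i) the share of users in cluster i."""
    total = sum(len(c) for c in clusters)
    return -sum((len(c) / total) * math.log(len(c) / total)
                for c in clusters if len(c) > 0)

def density(clusters, edges):
    """dens(C): fraction of all edges whose two endpoints share a cluster."""
    member = {u: i for i, c in enumerate(clusters) for u in c}
    inside = sum(1 for u, v in edges if member[u] == member[v])
    return inside / len(edges)

# Toy network: four users, two clusters, one edge crossing the clusters.
toy_clusters = [{1, 2}, {3, 4}]
toy_edges = [(1, 2), (3, 4), (2, 3)]
toy_entropy = entropy(toy_clusters)
toy_density = density(toy_clusters, toy_edges)
```

For two equally sized clusters the entropy equals $\log 2$, and two of the three toy edges stay inside a cluster, so the density is $2/3$.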


Consensus Metrics: Given the clustering results $C^{(1)} = \{U_i^{(1)}\}_{i=1}^{k^{(1)}}$ and $C^{(2)} = \{U_i^{(2)}\}_{i=1}^{k^{(2)}}$, the consensus metrics measuring how similarly or dissimilarly the anchor users are clustered in $C^{(1)}$ and $C^{(2)}$ include:

• rand: $rand(C^{(1)}, C^{(2)}) = \frac{N_{01} + N_{10}}{N_{00} + N_{01} + N_{10} + N_{11}}$, where $N_{11}$ ($N_{00}$) is the number of pairs of anchor users who are clustered in the same (different) community(ies) in both $C^{(1)}$ and $C^{(2)}$, and $N_{01}$ ($N_{10}$) is that of pairs of anchor users who are clustered in the same community (different communities) in $C^{(1)}$ but in different communities (the same community) in $C^{(2)}$. (Lower rand corresponds to better performance).
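The pair counting behind the rand metric can be sketched as follows. The sketch assumes that rand is the fraction of anchor-user pairs on which the two clusterings disagree, i.e., $(N_{01} + N_{10})$ divided by the number of all pairs, which matches the statement that lower values are better; the function name and the dictionary-based input format are hypothetical.

```python
from itertools import combinations

def rand_disagreement(labels1, labels2):
    """Fraction of anchor-user pairs grouped differently by the two clusterings.

    labels1 and labels2 map each anchor user to its cluster id in network 1
    and network 2 respectively, and must share the same keys.
    """
    n01 = n10 = total = 0
    for u, v in combinations(sorted(labels1), 2):
        same1 = labels1[u] == labels1[v]
        same2 = labels2[u] == labels2[v]
        if same1 and not same2:
            n01 += 1   # together in network 1, apart in network 2
        elif same2 and not same1:
            n10 += 1   # apart in network 1, together in network 2
        total += 1
    return (n01 + n10) / total

# Toy example with three anchor users.
c1 = {"a": 0, "b": 0, "c": 1}
c2 = {"a": 0, "b": 1, "c": 1}
toy_rand = rand_disagreement(c1, c2)
```

In the toy example the pairs (a, b) and (b, c) are treated differently by the two clusterings and (a, c) is treated the same, so two of three pairs disagree.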














Table 4: Community Detection Results of Foursquare and Twitter Evaluated by Quality Metrics.

                                remaining anchor link rates $\sigma$
network     measure     methods      0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9     1.0
Foursquare  ndbi        MCD        0.927   0.924    0.95   0.969   0.966   0.961   0.958   0.954   0.971   0.958
                        SICLUS     0.891   0.889    0.88   0.877   0.894   0.883    0.89    0.88   0.887   0.893
                        Ncut       0.863   0.863   0.863   0.863   0.863   0.863   0.863   0.863   0.863   0.863
                        Kmeans     0.835   0.835   0.835   0.835   0.835   0.835   0.835   0.835   0.835   0.835
            entropy     MCD        1.551   1.607   1.379   1.382   1.396   1.382   1.283   1.552   1.308   1.497
                        SICLUS     4.332   4.356   4.798   4.339   4.474   4.799   4.446   4.658   4.335   4.459
                        Ncut       2.768   2.768   2.768   2.768   2.768   2.768   2.768   2.768   2.768   2.768
                        Kmeans     2.369   2.369   2.369   2.369   2.369   2.369   2.369   2.369   2.369   2.369
            density     MCD        0.216   0.205   0.196   0.163   0.239   0.192   0.303   0.198   0.170   0.311
                        SICLUS     0.116   0.121    0.13   0.095   0.143    0.11    0.13    0.12   0.143   0.103
                        Ncut       0.154   0.154   0.154   0.154   0.154   0.154   0.154   0.154   0.154   0.154
                        Kmeans     0.182   0.182   0.182   0.182   0.182   0.182   0.182   0.182   0.182   0.182
            silhouette  MCD       -0.137  -0.114  -0.148  -0.156  -0.117   -0.11  -0.035  -0.125  -0.148  -0.044
                        SICLUS    -0.168  -0.198  -0.173  -0.189  -0.178  -0.181   -0.21  -0.195  -0.167   -0.18
                        Ncut       -0.34   -0.34   -0.34   -0.34   -0.34   -0.34   -0.34   -0.34   -0.34   -0.34
                        Kmeans    -0.297  -0.297  -0.297  -0.297  -0.297  -0.297  -0.297  -0.297  -0.297  -0.297
Twitter     ndbi        MCD        0.962   0.969   0.955   0.969    0.97   0.958   0.952    0.96   0.946   0.953
                        SICLUS     0.815   0.843   0.807    0.83   0.826   0.832   0.835   0.808   0.812   0.836
                        Ncut       0.759   0.759   0.759   0.759   0.759   0.759   0.759   0.759   0.759   0.759
                        Kmeans     0.761   0.761   0.761   0.761   0.761   0.761   0.761   0.761   0.761   0.761
            entropy     MCD         2.27   2.667    2.48   2.381    2.43   2.372   2.452   2.459   2.564   2.191
                        SICLUS     4.780   5.114   5.066   4.961   4.904   4.866   5.121   4.629   4.872   5.000
                        Ncut       3.099   3.099   3.099   3.099   3.099   3.099   3.099   3.099   3.099   3.099
                        Kmeans     3.245   3.245   3.245   3.245   3.245   3.245   3.245   3.245   3.245   3.245
            density     MCD         0.14   0.097   0.142   0.109    0.15   0.158   0.126   0.149   0.147   0.164
                        SICLUS     0.055   0.017   0.044   0.026    0.04   0.062   0.016   0.044   0.045    0.02
                        Ncut       0.107   0.107   0.107   0.107   0.107   0.107   0.107   0.107   0.107   0.107
                        Kmeans     0.119   0.119   0.119   0.119   0.119   0.119   0.119   0.119   0.119   0.119
            silhouette  MCD       -0.137  -0.179  -0.282  -0.175  -0.275  -0.273  -0.248  -0.269  -0.266  -0.286
                        SICLUS    -0.356  -0.322  -0.311  -0.347  -0.346  -0.349  -0.323  -0.363  -0.345  -0.352
                        Ncut      -0.424  -0.424  -0.424  -0.424  -0.424  -0.424  -0.424  -0.424  -0.424  -0.424
                        Kmeans    -0.406  -0.406  -0.406  -0.406  -0.406  -0.406  -0.406  -0.406  -0.406  -0.406


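• variation of information:

$$vi(\mathcal{C}^{(1)}, \mathcal{C}^{(2)}) = H(\mathcal{C}^{(1)}) + H(\mathcal{C}^{(2)}) - 2\, mi(\mathcal{C}^{(1)}, \mathcal{C}^{(2)}).$$

(Lower vi corresponds to better performance).

• mutual information:

$$mi(\mathcal{C}^{(1)}, \mathcal{C}^{(2)}) = \sum_{i=1}^{K^{(1)}} \sum_{j=1}^{K^{(2)}} P(i, j) \log \frac{P(i, j)}{P(i) P(j)},$$

where $P(i, j) = \frac{|U_i^{(1)} \cap_{\mathcal{A}} U_j^{(2)}|}{|\mathcal{A}|}$ and $|U_i^{(1)} \cap_{\mathcal{A}} U_j^{(2)}| = |\{u \mid u \in U_i^{(1)}, \exists v \in U_j^{(2)}, (u, v) \in \mathcal{A}\}|$. (Higher mi corresponds to better performance).

• normalized mutual information [21]: (Higher nmi corresponds to better performance).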
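The consensus metrics compare how the same anchor users are grouped in the two networks. The sketch below is illustrative only and rests on several assumptions: each anchor user is represented by one position in two equally long label lists (the anchor links are taken as already resolved), the logarithm base is 2, rand is computed as the share of disagreeing pairs (consistent with the statement that lower rand is better), and vi uses the standard definition $H + H - 2\,mi$. Normalized mutual information is left out because its normalization is not recoverable from the text.

```python
import math
from collections import Counter
from itertools import combinations

def rand_disagreement(c1, c2):
    """Share of anchor-user pairs grouped together in one clustering but not the other."""
    pairs = list(combinations(range(len(c1)), 2))
    disagree = sum((c1[i] == c1[j]) != (c2[i] == c2[j]) for i, j in pairs)
    return disagree / len(pairs)

def entropy(c, base=2.0):
    """Entropy of the community-size distribution over the anchor users."""
    n = len(c)
    return -sum((k / n) * math.log(k / n, base) for k in Counter(c).values())

def mutual_information(c1, c2, base=2.0):
    """mi = sum_ij P(i,j) log(P(i,j) / (P(i) P(j))) over co-occurring community pairs."""
    n = len(c1)
    joint, p1, p2 = Counter(zip(c1, c2)), Counter(c1), Counter(c2)
    return sum(
        (nij / n) * math.log((nij / n) / ((p1[i] / n) * (p2[j] / n)), base)
        for (i, j), nij in joint.items()
    )

def variation_of_information(c1, c2, base=2.0):
    """vi = H(C1) + H(C2) - 2 mi(C1, C2) (standard definition, assumed here)."""
    return entropy(c1, base) + entropy(c2, base) - 2 * mutual_information(c1, c2, base)

# Hypothetical labels of four anchor users in the two networks.
c1 = [0, 0, 1, 1]
c2 = [0, 0, 1, 0]
print(rand_disagreement(c1, c2))
print(mutual_information(c1, c2), variation_of_information(c1, c2))
```

Identical clusterings give a rand disagreement of 0 and a vi of 0, while statistically independent clusterings give a mutual information of 0.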


Definition 9 (Proportional and Inversely Proportional Metrics): Depending on the relationship between the metric value and the clustering results, all the above metrics can be either proportional or inversely proportional. Metric M is proportional iff better clustering results correspond to a higher M value; M is inversely proportional iff better clustering results correspond to a lower M value.

Among the metrics introduced above, normalized-dbi, density, silhouette, mutual information and normalized mutual information are proportional metrics, while entropy, rand, and variation of information are inversely proportional metrics.

To consider both the quality and consensus simultaneously, we introduce a new clustering metric, the IQC metric, in this paper, which is inversely proportional.

Definition 10 (IQC metrics): IQC is a linear combination of quality metrics Q and consensus metrics C:

$$IQC(\mathcal{C}^{(1)}, \mathcal{C}^{(2)}) = I(Q)\left(\beta_1 Q(\mathcal{C}^{(1)}) + \beta_2 Q(\mathcal{C}^{(2)})\right) + I(C)\left(\beta_3 C(\mathcal{C}^{(1)}, \mathcal{C}^{(2)}) + \beta_4 C(\mathcal{C}^{(2)}, \mathcal{C}^{(1)})\right),$$

where $\beta_1, \beta_2, \beta_3, \beta_4$ are weights of different terms, which are all set to 1 in this paper, and $I(Q), I(C) = -1$ if Q/C is proportional and 1 otherwise.

IQC metrics used in this paper include:

• $IQC_{ndbi,rand}(\mathcal{C}^{(1)}, \mathcal{C}^{(2)}) = -ndbi(\mathcal{C}^{(1)}) - ndbi(\mathcal{C}^{(2)}) + 2\, rand(\mathcal{C}^{(1)}, \mathcal{C}^{(2)})$

• $IQC_{density,nmi}(\mathcal{C}^{(1)}, \mathcal{C}^{(2)}) = -dens(\mathcal{C}^{(1)}) - dens(\mathcal{C}^{(2)}) - 2\, nmi(\mathcal{C}^{(1)}, \mathcal{C}^{(2)})$
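As a worked check of Definition 10, the sketch below recomputes two IQC scores for MCD at an anchor link rate of 0.1 from the per-metric values reported in Tables 4 and 5. The function name and argument layout are illustrative assumptions; all weights are 1 as stated above, and each consensus metric is assumed symmetric, so it enters twice.

```python
def iqc(q1, q2, c12, c21, quality_proportional, consensus_proportional,
        betas=(1.0, 1.0, 1.0, 1.0)):
    """IQC = I(Q)(b1*Q1 + b2*Q2) + I(C)(b3*C12 + b4*C21).

    I(.) is -1 for a proportional metric (higher is better) and +1 otherwise,
    so a lower IQC always means a better clustering.
    """
    i_q = -1.0 if quality_proportional else 1.0
    i_c = -1.0 if consensus_proportional else 1.0
    b1, b2, b3, b4 = betas
    return i_q * (b1 * q1 + b2 * q2) + i_c * (b3 * c12 + b4 * c21)

# MCD at rate 0.1: ndbi is 0.927 and 0.962 in the two networks, rand is 0.095.
iqc_ndbi_rand = iqc(0.927, 0.962, 0.095, 0.095,
                    quality_proportional=True, consensus_proportional=False)

# MCD at rate 0.1: entropy is 1.551 and 2.27 in the two networks, vi is 3.309.
iqc_entropy_vi = iqc(1.551, 2.27, 3.309, 3.309,
                     quality_proportional=False, consensus_proportional=False)

print(round(iqc_ndbi_rand, 3))   # -1.699
print(round(iqc_entropy_vi, 3))  # 10.439
```

Both values agree with the first MCD entries of the first two blocks of Table 6, which suggests those blocks report the ndbi/rand and entropy/vi combinations.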



















Table 5: Community Detection Results of Foursquare and Twitter Evaluated by Consensus Metrics.

                    remaining anchor link rates $\sigma$
measure     methods      0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9     1.0
rand        MCD        0.095   0.099   0.107   0.138   0.116   0.121   0.132   0.106   0.089   0.159
            SICLUS     0.135   0.139   0.144   0.148   0.142    0.14   0.132   0.132   0.144   0.141
            Ncut       0.399   0.377   0.372     0.4   0.416   0.423   0.362   0.385   0.362   0.341
            Kmeans     0.436   0.387     0.4   0.358   0.403   0.363   0.408   0.365    0.35   0.363
vi          MCD        3.309   4.052   4.058   3.902   4.038   4.348   3.973   3.944   4.078   2.911
            SICLUS      7.56   8.324   8.414   8.713   8.756   8.836   8.832   8.621   8.427    8.02
            Ncut       5.384   5.268   5.221   4.855   5.145   5.541   5.909    5.32   5.085   5.246
            Kmeans     5.427   5.117   5.355   5.326   5.679   5.944   5.452   5.567   5.513   4.686
nmi         MCD        0.152   0.152   0.149   0.141   0.149   0.156   0.142   0.158   0.147   0.146
            SICLUS     0.172   0.097   0.081    0.06   0.056   0.069   0.078   0.093   0.105   0.149
            Ncut       0.075   0.074   0.111   0.108   0.109   0.099    0.05   0.036   0.042   0.106
            Kmeans     0.008   0.047   0.048   0.054   0.048   0.028   0.047   0.014   0.067   0.119
mi          MCD        0.756   0.611     0.4   0.258   0.394   0.431   0.381   0.533   0.697   0.689
            SICLUS     0.780   0.446   0.367   0.277   0.258   0.325   0.374    0.44   0.489   0.698
            Ncut       0.188   0.181   0.261   0.232   0.252   0.243   0.138   0.092   0.111    0.31
            Kmeans      0.02   0.112   0.119   0.135   0.127   0.078   0.119   0.038   0.194   0.314


Table 6: Community Detection Results of Foursquare and Twitter Evaluated by IQC Metrics.

                              remaining anchor link rates $\sigma$
measure               methods      0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9     1.0
IQC_{ndbi,rand}       MCD       -1.699  -1.695  -1.691  -1.662  -1.705  -1.676  -1.647  -1.703  -1.738  -1.594
                      SICLUS    -1.459  -1.451   -1.44  -1.434  -1.444   -1.45  -1.465  -1.465  -1.442  -1.448
                      Ncut      -0.824  -0.869  -0.878  -0.821  -0.789  -0.776  -0.899  -0.851  -0.897   -0.94
                      Kmeans    -0.724  -0.821  -0.795   -0.88   -0.79   -0.87  -0.779  -0.865  -0.895  -0.869
IQC_{entropy,vi}      MCD       10.439  12.379  11.975  11.566  11.902   12.45  11.681  11.897  12.028   9.509
                      SICLUS     24.58  26.107  26.287  26.884  26.971   27.13  27.123    26.7  26.313  25.499
                      Ncut      16.634  16.403  16.308  15.577  16.156  16.948  17.684  16.506  16.036  16.359
                      Kmeans    16.468  15.847  16.325  16.267  16.972  17.503  16.519  16.748  16.641  14.986
IQC_{density,nmi}     MCD       -0.659  -0.606  -0.636  -0.555  -0.686  -0.663  -0.713  -0.664  -0.611  -0.768
                      SICLUS    -0.467  -0.317  -0.284  -0.243  -0.235  -0.261   -0.28  -0.309  -0.332  -0.421
                      Ncut      -0.411  -0.409  -0.484  -0.477  -0.478  -0.458  -0.361  -0.333  -0.345  -0.473
                      Kmeans    -0.317  -0.395  -0.397   -0.41  -0.398  -0.357  -0.396  -0.329  -0.436   -0.54
IQC_{silhouette,mi}   MCD       -1.239   -0.93  -0.371  -0.186  -0.396  -0.479  -0.479  -0.673  -0.979  -1.048
                      SICLUS    -1.028  -0.361  -0.202  -0.022   0.016  -0.118  -0.216  -0.347  -0.446  -0.863
                      Ncut       0.389   0.403   0.242     0.3   0.261   0.278   0.488    0.58   0.542   0.144
                      Kmeans     0.664   0.479   0.465   0.433    0.45   0.546   0.466   0.628   0.316   0.074






Figure 3: (a) $\|X^{(1)}\|$ and (b) $\|X^{(2)}\|$ in each iteration.


4.3 Experiment Results 

The experiment results are available in Tables 4-6. To show the effects of the anchor links, we use the same networks but randomly sample a proportion of anchor links from the networks, whose number is controlled by $\sigma \in \{0.1, 0.2, \cdots, 1.0\}$, where $\sigma = 0.1$ means that 10% of all the anchor links are preserved and $\sigma = 1.0$ means that all the anchor links are preserved.

Table 4 displays the clustering results of different methods in Foursquare and Twitter respectively under the evaluation of ndbi, entropy, density and silhouette. As shown in Table 4, MCD consistently achieves the highest ndbi score in both Foursquare and Twitter for different sample rates of anchor links. The entropy of the clustering results achieved by MCD is the lowest among all comparison methods and is about 70% lower than that of SICLUS and 40% lower than that of Ncut and Kmeans in both Foursquare and Twitter. In each community detected by MCD, the social connections are denser than in those detected by SICLUS, Ncut and Kmeans. Similar results can be obtained under the evaluation of silhouette: the silhouette score achieved by MCD is the highest among all comparison methods. So, MCD can achieve better results than modified multi-view and isolated clustering methods under the evaluation of quality metrics.

Figure 4: Analysis of parameters

Table 5 shows the clustering results on the aligned networks under the evaluation of consensus metrics, which include rand, vi, nmi and mi. As shown in Table 5, MCD can perform the best among all the comparison methods under the evaluation of consensus metrics. For example, the rand score of MCD is the lowest among all methods, and when $\sigma = 0.5$, the rand score of MCD is about 20% lower than that of SICLUS and about 72% lower than that of Ncut and Kmeans. Similar results can be obtained for other evaluation metrics: when $\sigma = 0.5$, the vi score of MCD is about half of the score of SICLUS, and the nmi and mi scores of MCD are about triple those of Kmeans. As a result, MCD can achieve better performance than both modified multi-view and isolated clustering methods under the evaluation of consensus metrics.
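The percentages quoted for the rand metric can be rechecked directly from the column for an anchor link rate of 0.5 in Table 5. The snippet below is a small arithmetic sketch; the helper name is an assumption introduced here, and the relative reduction is computed as (baseline - MCD) / baseline.

```python
def relative_reduction(ours, baseline):
    """Fraction by which `ours` is lower than `baseline`."""
    return (baseline - ours) / baseline

# rand scores at an anchor link rate of 0.5, copied from Table 5.
rand_at_05 = {"MCD": 0.116, "SICLUS": 0.142, "Ncut": 0.416, "Kmeans": 0.403}

reductions = {
    name: round(100 * relative_reduction(rand_at_05["MCD"], score), 1)
    for name, score in rand_at_05.items()
    if name != "MCD"
}
print(reductions)  # {'SICLUS': 18.3, 'Ncut': 72.1, 'Kmeans': 71.2}
```

The exact reduction against SICLUS is closer to 18% than 20%, while the figures for Ncut and Kmeans match the roughly 72% stated in the text.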

Table 6 shows the clustering results of different methods evaluated by the IQC metrics. As shown in Table 6, the $IQC_{ndbi,rand}$, $IQC_{entropy,vi}$, $IQC_{density,nmi}$ and $IQC_{silhouette,mi}$ scores of MCD are all the lowest among all comparison methods. Since a lower IQC score corresponds to better clustering results, MCD can outperform all other baseline methods consistently under the evaluation of all IQC metrics. In sum, MCD can perform better than both modified multi-view and isolated clustering methods evaluated by IQC metrics.

According to the results shown in Tables 4-6, we observe that the performance of MCD does not vary much as $\sigma$ changes. A possible reason is that, in MCD, normalized clustering discrepancy is applied to infer the clustering confidence matrices. As $\sigma$ increases in the experiments, more anchor links are added between networks, part of whose effects will be neutralized by the normalization of clustering discrepancy and therefore do not affect the performance of MCD much.

4.4 Convergence Analysis 

MCD can compute the solution of the optimization function with the Curvilinear Search method, which updates matrices $X^{(1)}$ and $X^{(2)}$ alternately. This process will continue until convergence. To check whether this process can stop or not, in this part, we analyze the convergence of $X^{(1)}$ and $X^{(2)}$. In Figure 3, we show the norms of matrices $X^{(1)}$ and $X^{(2)}$, $\|X^{(1)}\|$ and $\|X^{(2)}\|$, in each iteration of the updating algorithm, where $\|X\|$ denotes the norm of matrix $X$. As shown in Figure 3, both $\|X^{(1)}\|$ and $\|X^{(2)}\|$ converge in fewer than 200 iterations.
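The kind of convergence check described here can be sketched as a loop that alternates two updates and records a matrix norm after every iteration. This is only an illustration under stated assumptions: the update rule below is a toy averaging step, not the Curvilinear Search update used by MCD, the entrywise L1 norm is assumed because the norm definition is not legible in the text, and all names are hypothetical.

```python
def l1_norm(matrix):
    """Entrywise L1 norm: sum of absolute values (assumed choice of norm)."""
    return sum(abs(x) for row in matrix for x in row)

def average(a, b):
    """Toy update rule: entrywise mean of two equally shaped matrices."""
    return [[(x + y) / 2 for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def alternate_until_stable(x1, x2, update_x1, update_x2, tol=1e-6, max_iter=200):
    """Alternately update x1 and x2; stop once both norms change by less than tol."""
    history = [(l1_norm(x1), l1_norm(x2))]
    for _ in range(max_iter):
        x1 = update_x1(x1, x2)  # update the first matrix with the second fixed
        x2 = update_x2(x1, x2)  # update the second matrix with the new first fixed
        history.append((l1_norm(x1), l1_norm(x2)))
        (p1, p2), (n1, n2) = history[-2], history[-1]
        if abs(n1 - p1) < tol and abs(n2 - p2) < tol:
            break
    return x1, x2, history

x1 = [[4.0, 0.0], [0.0, 4.0]]
x2 = [[0.0, 0.0], [0.0, 0.0]]
x1, x2, history = alternate_until_stable(x1, x2, average, average)
print(len(history) - 1, history[-1])  # iterations used and final norms
```

Plotting the two columns of `history` against the iteration index gives curves of the same kind as the two panels of Figure 3. Note that a stable norm is a necessary but not sufficient sign that the matrices themselves have stopped changing.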


4.5 Parameter Analysis 

In method MCD, we have three parameters: $k^{(1)}$, $k^{(2)}$ and $\theta$, where $k^{(1)}$ and $k^{(2)}$ are the numbers of clusters in the Foursquare and Twitter networks respectively, while $\theta$ is the weight of the normalized discrepancy term in the objective function. In the previous experiments, we set $k^{(1)} = 50$, $k^{(2)} = 50$ and $\theta = 1.0$. Here we analyze the sensitivity of these parameters in detail.

To analyze $k^{(1)}$, we fix $k^{(2)} = 50$ and $\theta = 1.0$ but assign $k^{(1)}$ values in {10, 20, 30, 40, 50, 60, 70, 80, 90, 100}. The clustering results of MCD with different $k^{(1)}$ evaluated by the ndbi, rand and $IQC_{ndbi,rand}$ metrics are given in Figures 4(a)-4(d). As shown in the figures, the results achieved by MCD are very stable for $k^{(1)}$ within the range [40, 100] under the evaluation of ndbi in both Foursquare and Twitter. Similar results can be observed in Figures 4(c)-4(d), where the performance of MCD on aligned networks is not sensitive to the choice of $k^{(1)}$ for $k^{(1)}$ in the range [40, 100] under the evaluation of both rand and $IQC_{ndbi,rand}$. In a similar way, we can study the sensitivity of parameter $k^{(2)}$, the results of which are shown in Figures 4(e)-4(h).

An interesting phenomenon is that the pre-defined number of clusters in the Foursquare network can also affect MCD's performance in the Twitter network. As shown in Figure 4(b), the performance of MCD is the best in the Twitter network when $k^{(1)}$ is set to 30, as the ndbi score of MCD is the highest when $k^{(1)} = 30$. Figures 4(c)-4(d) show the performance of MCD under the evaluation of rand and $IQC_{ndbi,rand}$. MCD performs the best when $k^{(1)} = 40$ under the evaluation of the rand metric and achieves the best performance when $k^{(1)} = 40$ (or 90) evaluated by $IQC_{ndbi,rand}$.

To analyze the parameter $\theta$, we set both $k^{(1)}$ and $k^{(2)}$ to 50 but assign $\theta$ values in {0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 1000.0}. The results are shown in Figure 5. When $\theta$ is small, e.g., 0.001, the ndbi scores achieved by MCD in both Foursquare and Twitter are high but the rand score is not good (rand is inversely proportional). On the other hand, a large $\theta$ can lead to a good rand score but bad ndbi scores in both Foursquare and Twitter. As a result, (1) a large $\theta$ favors consensus results, and (2) a small $\theta$ preserves network characteristics and favors high quality results. Meanwhile, considering the clustering quality and consensus simultaneously, MCD achieves the best performance when $\theta = 1$, as the $IQC_{ndbi,rand}$ score is the lowest when $\theta = 1$ in Figure 5(d).

Figure 5: Analysis of parameter $\theta$.


5. RELATED WORK 

Clustering is a very broad research area, which includes various types of clustering problems, e.g., consensus clustering, multi-view clustering, multi-relational clustering [30], and co-training based clustering, and dozens of papers have been published on these topics. Lourenco et al. propose a probabilistic consensus clustering method by using evidence accumulation. Lock et al. propose a Bayesian consensus clustering method. Meanwhile, Bickel et al. propose to study the multi-view clustering problem, where the attributes of objects are split into two independent subsets. Cai et al. propose to apply multi-view K-Means clustering methods to big data. Yin et al. propose a user-guided multi-relational clustering method, CrossClus, to perform multi-relational clustering under the user's guidance. Kumar et al. propose to address the multi-view clustering problem based on a co-training setting.


One multi-view clustering work closely related to the problem studied in this paper relaxes the one-to-one constraint in traditional multi-view clustering problems to uncertain mappings. Weights of such mappings need to be decided by prior domain knowledge and each view is actually a homogeneous network. To regularize the clustering results, that work introduces a cost function called clustering disagreement, whose absolute value over all nodes in multiple views is involved in the optimization. Different from that work, (1) the constraint on anchor links in this paper is one-to-one and no domain knowledge is required, (2) each network involves different users and contains heterogeneous information, (3) we apply clustering discrepancy to constrain the community structures of anchor users only, and non-anchor users are pruned before calculating the discrepancy cost, and (4) the clustering discrepancy is normalized before being applied in the mutual clustering objective function.

Clustering based community detection in online social networks is a hot research topic and many different techniques have been proposed to optimize certain measures of the results, e.g., the modularity function and normalized cut [23]. Malliaros et al. give a comprehensive survey of related techniques used to detect communities in networks, and a detailed tutorial on spectral clustering has been given by Luxburg. These works are mostly based on homogeneous social networks. However, real-world online social networks contain abundant heterogeneous information generated by users' online social activities. Sun et al. study ranking-based clustering on heterogeneous networks, while Ji et al. study ranking-based classification problems on heterogeneous networks. Coscia et al. propose a classification based method for community detection in complex networks and Mucha et al. study the community structures in multiplex networks in [19].

In recent years, researchers' attention has started to shift to studying multiple heterogeneous social networks simultaneously. Kong et al. are the first to propose the concepts of aligned networks and anchor links. Across aligned social networks, different social network application problems have been studied, which include cross-network link prediction/transfer, emerging network clustering, large-scale network community detection, and inter-network information diffusion and influence maximization.


6. CONCLUSION

In this paper, we have studied the mutual clustering problem across multiple partially aligned heterogeneous online social networks. A novel clustering method, MCD, has been proposed to solve the mutual clustering problem. We have proposed a new similarity measure, HNMP-Sim, based on social meta paths in the networks. MCD can achieve very good clustering results in all aligned networks simultaneously with full consideration of the network difference problem as well as the connections across networks. Extensive experiments conducted on two real-world partially aligned heterogeneous networks demonstrate that MCD performs very well in solving the mutual clustering problem.